Abstract: We study the problem of maximizing the broadcast
rate in peer-to-peer (P2P) systems under node degree bounds, i.e.,
the number of neighbors a node can simultaneously connect to
is upper-bounded. The problem is critical for supporting
high-quality video streaming in P2P systems, and is challenging due to
its combinatorial nature. In this paper, we address this problem
by providing the first distributed solution that achieves
near-optimal broadcast rate under arbitrary node degree bounds
and over arbitrary overlay graphs. It runs on individual nodes
and utilizes only measurements from their one-hop neighbors,
making the solution easy to implement and adaptable to peer
churn and network dynamics. Our solution consists of two
distributed algorithms that may be of independent interest:
a network-coding based broadcasting algorithm
that optimizes the broadcast rate given a topology, and a
Markov-chain guided topology hopping algorithm that optimizes
the topology. Our distributed broadcasting algorithm achieves
the optimal broadcast rate over arbitrary P2P topologies, while
previously proposed distributed algorithms obtain optimality
only for P2P complete graphs. We prove the optimality of our
solution and its convergence to a neighborhood around the
optimal equilibrium under noisy measurements or without
time-scale separation assumptions. We demonstrate the effectiveness
of our solution in simulations using uplink bandwidth statistics
of Internet hosts.

I. Introduction 

Peer-to-peer (P2P) systems have provided a scalable and
cost-effective way of streaming video in the past decade.
Recent studies [1]-[4], however, indicate that the practical
performance of P2P streaming systems can be far from the
theoretical optimum.

There has been work studying the performance limits of
P2P systems to understand and unleash their potential. One
focus is on the streaming capacity problem [5] in P2P live
streaming systems, i.e., maximizing the streaming rate subject
to the peering and overlay topology constraints. The problem is
critical for supporting high-quality video, whose quality is determined
by the streaming rate, in P2P live streaming systems. In this
paper, we focus on the broadcast scenario where all peers in
the system are receivers.

The case of unconstrained peering on top of a complete 
graph is well studied, where the maximum broadcast rate 
is derived in several papers (TJ-Q, lfl6l . ifTTl . The case 
of unconstrained peering over a general graph can also be
addressed by using a centralized solution [5].

The streaming capacity problem becomes NP-complete
over general graphs with node degree bounds [10]. Node degree
is defined as the number of simultaneous active connections
that a node maintains with its neighbors. Due to connection
overhead costs, it is necessary to limit the number of
simultaneous connections a peer can maintain. This naturally bounds
the node degrees in P2P systems. For instance, in practical
systems such as PPLive [18], the total number of neighbors
of a node is usually bounded around 200, and the number 
of active neighbors of a node is usually bounded by 10-15 
03) ■ In such large P2P systems with hundreds of thousands 
of peers, the system topology is not a complete graph. 

There has been work studying this challenging problem 
of maximizing streaming rate under node degree bounds 
and over general P2P graph. SplitStream/CoopNet (6), 0, 
ZIGZAG 0, PRIME and most practical systems (such 
as PPLive fH) and UUSee (HI) bound node degree but do 
not provide rate optimality guarantees. Recently, the authors
in [10] proposed a centralized Cluster-Tree algorithm that
achieves near-optimal broadcast rate with high probability over
complete graphs, under the assumption that the node degree
bound is at least logarithmic in the size of the network. A
summary and comparison of previous work and this work are
given in Table I.

Despite these exciting results, the following two important
questions remain open:

• What is the maximum broadcast rate under arbitrary node
degree bounds and over general P2P overlay graphs?
• How can the maximum broadcast rate be achieved in a
distributed manner?

Systems running distributed algorithms, compared with those
running centralized algorithms, are more adaptable to peer
churn and network dynamics.

In this paper, we answer the above two questions and make 
the following contributions: 

• We provide the first distributed solution that achieves 
a broadcast rate arbitrarily close to the optimal under 
arbitrary node degree bounds, and over arbitrary overlay 
graph. Our solution runs on individual nodes and utilizes 
only the information from their one-hop neighbors. 

Our solution consists of the following two algorithms, which may
be of independent interest.

TABLE I
Summary and comparison of previous work and this work for maximizing P2P broadcast rate.
* The Cluster-Tree algorithm is $(1 - \epsilon)$-optimal with high probability if the node degree bound is $O(\log N)$.

• We propose a distributed broadcasting algorithm that
achieves the optimal broadcast rate over arbitrary overlay
graphs. Previous distributed P2P broadcasting algorithms
are optimal only for complete overlay graph fH-151. Our 
algorithm is based on network coding and utilizes back- 
pressure arguments. 

• We also propose a distributed algorithm that optimizes the
topology. In this algorithm, each node hops among its
possible sets of neighbors towards the best peering
configuration. Our algorithm is inspired by a set of
log-sum-exp approximation and Markov chain based arguments
expounded in [20].

• We prove the optimality of the overall solution. We also
prove its convergence to a neighborhood around the
optimal equilibrium in the presence of noisy measurements
or without time-scale separation assumptions. We
demonstrate the effectiveness of our solution in simulations
using uplink bandwidth statistics of Internet hosts.

II. Problem Formulation 
A. Settings and Notations 

We model the P2P overlay network as a general directed
graph $G = (V, E)$, where $V$ denotes the set of nodes and $E$
denotes the set of links. Each link in the graph corresponds
to a TCP/UDP connection between two nodes. Let $N_v$ denote
the neighbor set of node $v \in V$ in the graph. Each node $v \in V$
is associated with an upload capacity $C_v > 0$. We assume
there is no constraint on the downloading rate of any node
$v \in V$. This assumption can be partly justified by the empirical
observation that, as residential broadband connections with
asymmetric upload and download rates become increasingly
dominant, bottlenecks are typically at the uplinks of the access
networks rather than in the middle of the Internet.

As such, P2P networks have capacity limits on the nodes 
instead of links. This is different from traditional underlay 
networks where the capacity limits are on the links. 

We focus on the single-source streaming scenario, i.e., a
source $s$ broadcasts a continuous stream of content to the
entire network; we denote its receiver set as $R = V - \{s\}$.

We consider the peering constraint that each node $v$ has a
degree bound $B_v$, i.e., it can exchange streaming content
with at most $B_v$ neighbors simultaneously due to
connection overhead cost. We allow different nodes to have
different degree bounds. Fig. 1 shows four sample peering
configurations of a 5-node network with node degree bound 3
for each node.




Fig. 1. Peering configuration examples for a 5-node network with node
degree bound 3 for each node, shown in panels (a)-(d).



Let $\mathcal{F}$ denote the set of all feasible peering configurations
over graph $G$ under node degree bounds. Given a configuration
$f \in \mathcal{F}$, we obtain a connected sub-graph of $G$ that satisfies
the node degree bound constraints. We denote this sub-graph
as $G_f = (V, E_f)$, where $E_f$ represents the set of links in this
sub-graph. We denote by $N_{v,f}$ the set of node $v$'s neighbors in
this sub-graph. We have $|N_{v,f}| \le B_v$, where $|\cdot|$ represents the
size of a set.
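
The feasibility conditions above (links drawn from the overlay, degree bounds respected, sub-graph connected) can be checked mechanically. The following is a minimal sketch, not part of the original algorithm; the function name `is_feasible` and its arguments are hypothetical, and links are treated as undirected node pairs with no self-loops, which is an assumption made for illustration.

```python
from collections import deque


def is_feasible(nodes, overlay_edges, config_edges, degree_bound):
    """Return True if config_edges is a feasible peering configuration.

    nodes: list of node identifiers (assumed non-empty).
    overlay_edges / config_edges: iterables of 2-tuples (undirected pairs).
    degree_bound: dict mapping each node v to its bound B_v.
    """
    overlay = {frozenset(e) for e in overlay_edges}
    config = {frozenset(e) for e in config_edges}
    # Every link in use must exist in the overlay graph G.
    if not config <= overlay:
        return False
    neighbors = {v: set() for v in nodes}
    for e in config:
        u, v = tuple(e)
        neighbors[u].add(v)
        neighbors[v].add(u)
    # Degree bound: |N_{v,f}| <= B_v for every node v.
    if any(len(neighbors[v]) > degree_bound[v] for v in nodes):
        return False
    # Connectivity: breadth-first search from an arbitrary node.
    start = nodes[0]
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for u in neighbors[v] - seen:
            seen.add(u)
            queue.append(u)
    return len(seen) == len(nodes)
```

Enumerating all such configurations is exponential in the number of overlay links, which is one way to see why the problem studied here is combinatorial.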

B. Problem Formulation and Our Approach 

For a configuration $f \in \mathcal{F}$, let $x_f$ be the maximum
achievable broadcast rate under $f$, i.e., the highest rate at
which every node in the system can receive the streaming
content simultaneously. The problem of maximizing broadcast
rate under node degree bounds can be formulated as follows:

$$\text{MRC}: \quad \max_{f \in \mathcal{F}} x_f. \tag{1}$$

This problem is combinatorial in nature and is known
to be NP-complete [10], and there is no efficient approximate
solution to the problem even in a centralized manner.

In this paper, we address this problem by providing a
distributed solution. In particular, we first develop a distributed
broadcasting algorithm that can achieve $x_f$ under arbitrary
$f \in \mathcal{F}$. We then design a distributed algorithm that
optimizes towards the best peering configurations. They operate
in tandem to achieve a close-to-optimal broadcast rate under
arbitrary node degree bounds and over arbitrary overlay graphs.
We elaborate on these two algorithms in the following two
sections.

III. The Proposed Distributed Broadcasting Algorithm 

By exploiting network coding [21], we design a
back-pressure based distributed broadcasting algorithm.
Back-pressure type algorithms were initially proposed in [22]. Algorithms
of this type select a subset of queues in the system with
the maximum back-pressures and serve these queues subject
to resource constraints, where back-pressure is defined as the
difference between the queue at the local node and those of its
downstream nodes. Back-pressure algorithm design has found
applications in many network resource allocation domains
[23], [24], [25]. In this paper, we apply this method for the
first time to design a distributed P2P broadcasting algorithm.
Our algorithm can achieve the maximum broadcast rate over
arbitrary P2P topologies.

A. Routing vs. Network Coding 

In P2P systems, there are two approaches for broadcasting
content: one is based on routing [26], in which nodes only
store and forward packets; the other is based on network
coding [21], [26], in which a node is also allowed to mix
information and output data as functions of the data it received.
Some commercial P2P systems are built upon routing-based 
approach (e.g., PPLive [18]), and some are based on network 
coding (e.g., UUSee QU, USUI- It is known that both 
routing and network coding approaches can achieve optimal 
broadcast rate over arbitrary P2P graph 0, ifTTl . Compared to 
routing-based approach, the network-coding based approach
introduces additional packet header overhead for carrying
coding coefficients (e.g., 3% extra overhead according to
[29]) and computational complexity for encoding and decoding
(e.g., [13], [27] discuss how to keep the complexity low).
However, the network-coding based approach is robust to
peer dynamics since there is no need to construct and
maintain spanning trees. In this section, we design a
distributed broadcasting algorithm based on network coding
that is robust to dynamics. In Section VIII, we will discuss
how the overall problem can be solved by using centralized
solutions when only routing is allowed.

B. Network Coding Based Formulation 

According to the Max-Flow-Min-Cut theorem, a data
transmission of rate $z$ between source $s$ and a receiver $d$ is feasible
if and only if there exists a flow, denoted as $f^d$, satisfying the
following flow conservation constraints:

$$\sum_{u \in in(v)} f^d_{uv} \le \sum_{u \in out(v)} f^d_{vu}, \quad \forall v \in R - \{d\}, \tag{2}$$

$$z \le \sum_{u \in out(s)} f^d_{su}, \tag{3}$$

$$0 \le f^d, \tag{4}$$

where $in(v) = \{u \mid (u, v) \in E_f\}$ is the set of nodes sending content
to $v$ under configuration $f$, and $out(v) = \{u \mid (v, u) \in E_f, u \ne s\}$
is the set of nodes receiving content from $v$.

1 We refer interested readers to [27], [28] for more details on the performance
of routing-based and network-coding-based practical P2P systems. We focus
on optimal distributed P2P broadcasting algorithm design based on network
coding in this paper.

A powerful theorem established in [21] states that a
multicast or broadcast rate $z$ from $s$ to a set of receivers is achievable
if and only if $z$ is feasible for $s$ and any receiver $d$. This is
a strong result, as it says that if the network can support a
unicast rate of $z$ between $s$ and any receiver assuming other
receivers' traffic is absent, then it can support a multicast rate
of $z$ to all the receivers simultaneously. Such a rate $z$ can be
achieved by every node in the network performing network
coding [21]. Further, the authors in [29], [30] show that it is
sufficient to perform random linear network coding.

In random linear network coding, by independently and ran- 
domly choosing a set of coding coefficients from a finite field, 
each node sends out the coded packet as a linear combination 
of the node's received packets. The combination information is 
specified by a coefficient vector in the packet header, which is 
updated by applying the same linear transformations as to the 
data. When one node receives a full set of linearly independent 
coded packets, it can decode and recover the original packets. 
In this paper, we focus on the distributed algorithm design. 
Discussions of decoding probability and implementation
details can be found in [29], [30].
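
To make the encode, re-encode, and decode steps concrete, here is a minimal sketch of random linear network coding. It is an illustration under stated assumptions rather than the implementation used in any cited system: it works over the small prime field GF(257) instead of a binary extension field, packets are lists of integers below 257, and the names `encode`, `recode`, and `decode` are hypothetical.

```python
import random

P = 257  # prime field size; an illustrative assumption, not from the paper


def encode(packets, rng):
    """Return (coefficients, payload) for one random linear combination."""
    coeffs = [rng.randrange(P) for _ in packets]
    length = len(packets[0])
    payload = [sum(c * pkt[i] for c, pkt in zip(coeffs, packets)) % P
               for i in range(length)]
    return coeffs, payload


def recode(coded, rng):
    """Mix already-coded packets at a relay.

    The coefficient header undergoes the same linear map as the data.
    """
    weights = [rng.randrange(P) for _ in coded]
    k, n = len(coded[0][0]), len(coded[0][1])
    coeffs = [sum(w * c[i] for w, (c, _) in zip(weights, coded)) % P
              for i in range(k)]
    payload = [sum(w * p[i] for w, (_, p) in zip(weights, coded)) % P
               for i in range(n)]
    return coeffs, payload


def decode(coded, k):
    """Recover the k source packets by Gauss-Jordan elimination mod P.

    Returns None if the received packets do not yet have rank k.
    """
    rows = [list(c) + list(p) for c, p in coded]
    for col in range(k):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, P)  # modular inverse (Python 3.8+)
        rows[col] = [x * inv % P for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % P
                           for a, b in zip(rows[r], rows[col])]
    return [row[k:] for row in rows[:k]]
```

A receiver simply keeps collecting coded packets and calls `decode` until it returns a result; a relay can forward `recode` output without ever decoding.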

Under the setting of network coding, we can consider $f^d$
as a "virtual" information flow between $s$ and $d$. Multiple
information flows "piggyback" together to transmit over the
physical links. The actual physical rate over a physical link is
only the maximum rate of the individual information flows passing
over it. Let $g_{uv}$ be the physical flow rate over a link $(u, v) \in E_f$;
then we have $f^d_{uv} \le g_{uv}$ for all $d \in R$.

With the above understanding, we formulate the problem of
maximizing broadcast rate under configuration $f$ as follows:

$$\text{MP}: \quad \max_{z, f, g \ge 0} U(z) \tag{5}$$

$$\text{s.t.} \quad \sum_{u \in in(v)} f^d_{uv} + z 1_{\{v=s\}} \le \sum_{u \in out(v)} f^d_{vu}, \quad \forall v \in V - \{d\}, \; d \in R, \tag{6}$$

$$f^d_{vu} \le g_{vu}, \quad \forall v \in V, \; \forall u \in out(v), \; d \in R, \tag{7}$$

$$\sum_{u \in out(v)} g_{vu} \le C_v, \quad \forall v \in V, \tag{8}$$

where $U(z)$ is a twice-differentiable strictly concave utility
function (see footnote 2) and $1_{\{\cdot\}}$ denotes the indicator function. The constraints
in (6) describe the flow conservation requirements. The
constraints in (7) come from the piggybacking property of
information flows. The node upload capacity constraints are in (8).
The problem MP is a convex problem. All feasible broadcast
rates must satisfy the constraints in (6)-(8) and are achievable
by using random linear network coding.

2 It might seem unnecessary to involve a strictly concave utility function in
this formulation. The reason is that we later design a primal-dual algorithm
to solve the problem, and using a strictly concave utility function can avoid
its potential instability problem [17].

C. Algorithm Design via Lagrange Decomposition 

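To proceed, we first relax the first set of constraints in (6)
in problem MP to obtain a partial Lagrangian as follows:

$$
\begin{aligned}
L(z, f, g, \lambda)
&= U(z) - \sum_{d \in R} \sum_{v \in V - \{d\}} \lambda_{v,d} \Big( \sum_{u \in in(v)} f^d_{uv} + z 1_{\{v=s\}} - \sum_{u \in out(v)} f^d_{vu} \Big) \\
&= U(z) - \sum_{v \in V} \sum_{d \in R} \lambda_{v,d} \Big( \sum_{u \in in(v)} f^d_{uv} + z 1_{\{v=s\}} - \sum_{u \in out(v)} f^d_{vu} \Big),
\end{aligned}
\tag{9}
$$

where $\lambda_{v,d}$, $v \in V - \{d\}$, $d \in R$, are Lagrange multipliers,
$\lambda_{d,d} = 0$, $\forall d \in R$, and $\sum_{u \in in(s)} f^d_{us} = 0$.

Strong duality holds for problem MP since the Slater
conditions are satisfied [31]. Therefore, we can solve problem
MP by finding the saddle points of $L(z, f, g, \lambda)$.

Noticing that

$$\sum_{v \in V} \sum_{d \in R} \lambda_{v,d} \, z 1_{\{v=s\}} = z \sum_{d \in R} \lambda_{s,d} \tag{10}$$

and

$$\sum_{v \in V} \sum_{d \in R} \lambda_{v,d} \Big( \sum_{u \in in(v)} f^d_{uv} - \sum_{u \in out(v)} f^d_{vu} \Big) = \sum_{d \in R} \sum_{v \in V} \sum_{u \in out(v)} f^d_{vu} \left( \lambda_{u,d} - \lambda_{v,d} \right), \tag{11}$$

we can find the saddle points of $L(z, f, g, \lambda)$ by solving the
following problem successively in $z$, $f$, $g$, $\lambda$:

$$\min_{\lambda \ge 0} \Big[ \max_{z} \Big( U(z) - z \sum_{d \in R} \lambda_{s,d} \Big) + \max_{f, g} \sum_{d \in R} \sum_{v \in V} \sum_{u \in out(v)} f^d_{vu} \left( \lambda_{v,d} - \lambda_{u,d} \right) \Big] \tag{12}$$

$$\text{s.t.} \quad (7)\text{-}(8).$$

Given $\lambda$ and $z$, we consider the following scheduling
sub-problem on $f$, $g$:

$$\text{SSP}: \quad \max_{f, g \ge 0} \sum_{d \in R} \sum_{v \in V} \sum_{u \in out(v)} f^d_{vu} \left( \lambda_{v,d} - \lambda_{u,d} \right) \tag{13}$$

$$\text{s.t.} \quad (7)\text{-}(8).$$

The above linear programming problem has a structure that
allows us to solve it distributedly. The first observation is that
if an optimal $g^*$ is given, then an optimal $f^*$ can be obtained
as follows: $\forall u, v \in V$, $d \in R$,

$$(f^d_{vu})^* = \begin{cases} 0, & \text{if } \lambda_{v,d} - \lambda_{u,d} \le 0, \\ g^*_{vu}, & \text{otherwise.} \end{cases} \tag{14}$$

As such, it is sufficient to study the following problem in $g$:

$$\max_{g \ge 0} \sum_{v \in V} \sum_{u \in out(v)} g_{vu} w_{vu} \tag{15}$$

$$\text{s.t.} \quad \sum_{u \in out(v)} g_{vu} \le C_v, \quad \forall v \in V,$$

where

$$w_{vu} = \sum_{d \in R} \left[ \lambda_{v,d} - \lambda_{u,d} \right]^+, \quad \forall (v, u) \in E_f, \tag{16}$$

denotes the aggregate back-pressure between two neighboring
nodes $v$ and $u$, and $[\cdot]^+ = \max(\cdot, 0)$.
For any $v \in V$, let

$$u^*(v) = \arg\max_{u \in out(v)} w_{vu} \tag{17}$$

be one of its neighbors with the maximum back-pressure
(breaking ties arbitrarily). Then one optimal solution for
problem SSP is as follows:

$$g^*_{vu} = \begin{cases} C_v, & \text{if } u = u^*(v), \\ 0, & \text{otherwise,} \end{cases} \tag{18}$$

and

$$(f^d_{vu})^* = \begin{cases} 0, & \text{if } \lambda_{v,d} - \lambda_{u,d} \le 0, \\ g^*_{vu}, & \text{otherwise.} \end{cases} \tag{19}$$

Given $f^*$ and $g^*$, primal-dual algorithms can be designed to
adapt $z$ and $\lambda$ to pursue the desired optimal solution.

We summarize the above analysis into a distributed
algorithm including the following components:

Primal-dual Rate Control: we pursue the saddle point in
$z$ and $\lambda$ simultaneously as follows:

$$
\begin{cases}
\dot{z} = \alpha \Big[ U'(z) - \sum_{d \in R} \lambda_{s,d} \Big]^+_{z}, \\
\dot{\lambda}_{v,d} = k_{v,d} \Big[ \sum_{u \in in(v)} (f^d_{uv})^* + z 1_{\{v=s\}} - \sum_{u \in out(v)} (f^d_{vu})^* \Big]^+_{\lambda_{v,d}}, & \forall v \in V - \{d\}, \; d \in R, \\
\dot{\lambda}_{d,d} = \lambda_{d,d} = 0, & \forall d \in R,
\end{cases}
\tag{20}
$$

where $\alpha$ and $k_{v,d}$ are positive step sizes, and the function

$$[b]^+_a = \begin{cases} \max(0, b), & a \le 0, \\ b, & a > 0. \end{cases}$$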

Neighbor Scheduling, Content Scheduling, and Network
Coding: Every node $v \in V$ maintains, for each receiver $d \in R$, a queue storing packets
that are intended for $d$. Whenever a transmission opportunity
arises, node $v$ chooses one neighbor $u^*(v)$ with the maximum
back-pressure according to (17).

If $w_{vu^*(v)} > 0$, node $v$ sends packets to $u^*(v)$ at rate $C_v$. Every
output packet is constructed as follows. Node $v$ chooses one
packet from the head of the queue of each $d$ with $\lambda_{v,d} - \lambda_{u^*(v),d} > 0$,
and outputs one random linear combination of these
head-of-queue packets. Otherwise, if $w_{vu^*(v)} \le 0$ or there are no
head-of-queue packets to code, node $v$ does nothing.
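
The per-node scheduling decision and the multiplier update can be sketched in a few lines. This is a hedged illustration of the rules in (16)-(20), not the authors' code: the function names, the dictionary layout `lam[node][d]`, and the forward-Euler discretization with step `dt` are all assumptions introduced here.

```python
def schedule(v, out_neighbors, lam, capacity, receivers):
    """Pick the max-back-pressure neighbor of v and the resulting flows.

    lam[x][d] plays the role of the multiplier lambda_{x,d}
    (with lam[d][d] == 0). Returns (u_star, g, f), where g[u] is the
    physical rate to u and f[(u, d)] the virtual rate to u for receiver d.
    """
    best_u, best_w = None, 0.0
    for u in out_neighbors:
        # Aggregate back-pressure w_vu, summed over all receivers.
        w = sum(max(lam[v][d] - lam[u][d], 0.0) for d in receivers)
        if w > best_w:
            best_u, best_w = u, w
    g = {u: 0.0 for u in out_neighbors}
    f = {(u, d): 0.0 for u in out_neighbors for d in receivers}
    if best_u is not None:  # positive back-pressure exists
        g[best_u] = capacity
        for d in receivers:
            if lam[v][d] - lam[best_u][d] > 0:
                f[(best_u, d)] = capacity
    return best_u, g, f


def projected(b, a):
    """The projection [b]^+_a: b if a > 0, otherwise max(0, b)."""
    return b if a > 0 else max(0.0, b)


def dual_step(lam_vd, inflow, outflow, injected, k, dt):
    """One forward-Euler step of the multiplier dynamics (assumed discretization)."""
    drift = projected(inflow + injected - outflow, lam_vd)
    # The outer max guards against overshooting below zero in discrete time.
    return max(0.0, lam_vd + dt * k * drift)
```

Note that `schedule` only reads the multipliers of `v` and of its out-neighbors, which is what makes the rule implementable with one-hop information.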

We have the following observations. 

• The Lagrange multiplier $\lambda_{v,d}$ is proportional to the length
of the queue storing packets that are intended for receiver $d$.
The back-pressure $w_{vu}$ measures the aggregate difference
in the queues of all $d \in R$ between $v$ and $u$. The larger
the back-pressure is, the more urgently node $u$ needs to
receive data from $v$.

• Our algorithm can be implemented in a distributed
manner. It only requires nodes to exchange information with
their one-hop neighbors, and thus is robust to peer churn and
system dynamics. When a new peer arrives, it connects
to a set of neighbors assigned by the streaming server
or trackers. The peer then starts exchanging streaming
data with them following the strategy defined by our
algorithm. When a peer leaves, its neighbors are informed
and then close the connections. For the network coding
operation, theoretically we need to adjust the size of field 
where the coding coefficients are chosen to make sure 
of the decoding probability when the number of nodes 
changes 03. ED- While (29] and 1 13| show that in 
practice the finite field $\mathbb{F}_{2^8}$ or $\mathbb{F}_{2^{16}}$ is enough to have a
sufficiently high decoding probability. Therefore, peer dynamics
trigger only local configuration changes, which
are easy to implement compared to centralized algorithms,
where global information is typically needed to change the whole
configuration (e.g., spanning tree reconstruction
in spanning-tree-based solutions).

• Although our algorithm is designed for P2P broadcast
scenarios, it also works for P2P multicast scenarios where
helper nodes exist. The helper nodes simply perform
the operations described in (18)-(20) as well. Our algorithm can
be considered as the extension of the algorithm in QUI 
from link-capacity-limited underlay networks to node- 
capacity-limited overlay networks. 

The following theorem characterizes the convergence of the
proposed algorithm.

Theorem 1: The algorithm in (18)-(20) converges to the
optimal solution of problem MP globally asymptotically in
time.

The proof utilizes standard Lyapunov arguments and a
Lyapunov function for primal-dual algorithms, similar to those used
in [17], [34]. The proof is relegated to Appendix A.

Remark: We derive our algorithm and prove its conver- 
gence based on a fluid model formulation. It is also possible 
to obtain a similar back-pressure based distributed algorithm 
with packet-level dynamics taken into account and prove 
its stability, following a set of Lyapunov drift arguments 
elaborated in 



IV. The Proposed Distributed Topology Hopping Algorithm 



We recently proposed in [20] to use Markov chains as a
principled approach to designing distributed algorithms for
solving combinatorial network problems approximately. In
particular, we show that one can design distributed algorithms
for a combinatorial network optimization problem in the
following way. First, construct a special class of Markov
chains with problem-specific steady-state distribution. Second,
search for a Markov chain in this class that allows distributed
implementation. If such a Markov chain can be found, which
is usually challenging and problem-specific, the distributed
implementation directly yields a distributed algorithm for the
problem.

In this paper, we follow the framework from [20] and design
a distributed topology hopping algorithm for our problem (1).
There are two steps in designing our algorithm under the
Markov approximation framework [20]: log-sum-exp
approximation and constructing problem-specific Markov chains that
allow distributed implementation.

A. Log-Sum-Exp Approximation 

First, the maximum broadcast rate can be approximated by
a log-sum-exp function as follows:

$$\max_{f \in \mathcal{F}} x_f \approx \frac{1}{\beta} \log \Big( \sum_{f \in \mathcal{F}} \exp(\beta x_f) \Big), \tag{21}$$

where $\beta$ is a positive constant. Let $|\mathcal{F}|$ denote the size of the
set $\mathcal{F}$; then the approximation accuracy is known as follows
[20]:

$$0 \le \frac{1}{\beta} \log \Big( \sum_{f \in \mathcal{F}} \exp(\beta x_f) \Big) - \max_{f \in \mathcal{F}} x_f \le \frac{1}{\beta} \log |\mathcal{F}|. \tag{22}$$

As $\beta$ approaches infinity, the approximation gap approaches
zero. As discussed in [20], however, $\beta$ should usually not
take too large a value, as there are practical constraints and
convergence rate concerns in the subsequent algorithm design.
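
The bound on the approximation gap is easy to check numerically. The sketch below is illustrative only; the function name `log_sum_exp_approx` and the sample rates are made up for the example, and the shift by the maximum is a standard numerical-stability trick rather than something taken from the paper.

```python
import math


def log_sum_exp_approx(rates, beta):
    """Return (1/beta) * log(sum_f exp(beta * x_f)) for a list of rates."""
    m = max(rates)
    # Shift by the maximum so that every exponent is <= 0 (avoids overflow).
    return m + math.log(sum(math.exp(beta * (x - m)) for x in rates)) / beta


rates = [1.0, 2.0, 3.5, 3.4]  # hypothetical broadcast rates x_f
for beta in (1.0, 10.0, 100.0):
    gap = log_sum_exp_approx(rates, beta) - max(rates)
    bound = math.log(len(rates)) / beta
    print(f"beta={beta:6.1f}  gap={gap:.6f}  bound={bound:.6f}")
```

The printed gap shrinks as `beta` grows and always stays between zero and `log(len(rates)) / beta`.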

To better understand the log-sum-exp approximation, we
associate with each configuration $f \in \mathcal{F}$ a probability $p_f$.
Consider the following problem:

$$\text{MRC-EQ}: \quad \max_{p \ge 0} \sum_{f \in \mathcal{F}} p_f x_f \tag{23}$$

$$\text{s.t.} \quad \sum_{f \in \mathcal{F}} p_f = 1. \tag{24}$$

Its optimal value is $\max_{f \in \mathcal{F}} x_f$ and is obtained by setting the
probability corresponding to one of the best configurations to
one and the remaining probabilities to zero. Hence, problem
MRC-EQ is equivalent to the original problem MRC.

On the other hand, according to [20], we have the following
observations.

Theorem 2 (cf. [20]): The optimal value of the following
optimization problem

$$\text{MRC-}\beta: \quad \max_{p \ge 0} \sum_{f \in \mathcal{F}} p_f x_f - \frac{1}{\beta} \sum_{f \in \mathcal{F}} p_f \log p_f \tag{25}$$

$$\text{s.t.} \quad \sum_{f \in \mathcal{F}} p_f = 1 \tag{26}$$

is given by $\frac{1}{\beta} \log \big[ \sum_{f \in \mathcal{F}} \exp(\beta x_f) \big]$. The optimal solution of
problem MRC-$\beta$ is given by

$$p^*_f(x) = \frac{\exp(\beta x_f)}{\sum_{f' \in \mathcal{F}} \exp(\beta x_{f'})}, \quad \forall f \in \mathcal{F}. \tag{27}$$

As such, by the log-sum-exp approximation in (21), we
obtain an approximate version of the maximum broadcast rate
problem MRC, off by an entropy term $-\frac{1}{\beta} \sum_{f \in \mathcal{F}} p_f \log p_f$. If
we can time-share among different configurations according
to the optimal solution $p^*_f(x)$ in (27), then we can solve the
problem MRC approximately and obtain a close-to-optimal
broadcast rate.
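
As a sanity check of Theorem 2, the following sketch evaluates the entropy-regularized objective at the distribution $p^*_f(x) \propto \exp(\beta x_f)$ and compares it with the log-sum-exp value. The function names and the sample rates are hypothetical and chosen only for illustration.

```python
import math


def gibbs_distribution(rates, beta):
    """Return p*_f proportional to exp(beta * x_f), computed stably."""
    m = max(rates)
    weights = [math.exp(beta * (x - m)) for x in rates]
    total = sum(weights)
    return [w / total for w in weights]


def regularized_objective(p, rates, beta):
    """Evaluate sum_f p_f x_f - (1/beta) sum_f p_f log p_f."""
    expected = sum(pf * x for pf, x in zip(p, rates))
    entropy = -sum(pf * math.log(pf) for pf in p if pf > 0)
    return expected + entropy / beta


rates, beta = [1.0, 2.0, 3.0], 2.0  # illustrative values
p_star = gibbs_distribution(rates, beta)
lse = math.log(sum(math.exp(beta * x) for x in rates)) / beta
print(regularized_objective(p_star, rates, beta), lse)  # the two values agree
```

Any other distribution, such as the uniform one, yields a strictly smaller objective value, which is what makes time-sharing according to $p^*_f(x)$ the target of the topology hopping algorithm.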

B. Markov Chain Guided Algorithm Design 

We design a Markov chain whose state space is the set
of all feasible peering configurations $\mathcal{F}$ and whose stationary
distribution is $p^*_f(x)$ in (27). We implement the Markov
chain to guide the system to optimize the configuration. As
the system hops among configurations, the Markov chain
converges and the configurations are time-shared according
to the desired distribution $p^*_f(x)$.

The key lies in designing such a Markov chain that allows
distributed implementation. Since $p^*_f(x)$ in (27) is of
product form, it suffices to focus on designing time-reversible Markov
chains [20].

Let $f, f' \in \mathcal{F}$ be two states of the Markov chain, and denote by
$q_{f,f'}$ the transition rate from state $f$ to $f'$. We have two
degrees of freedom in designing a time-reversible Markov
chain:

• The state space structure: we can add or cut direct
transitions between any two states, given that the state
space remains connected and any two states are reachable
from each other.

• The transition rates: we can explore various options in
designing $q_{f,f'}$, given that the detailed balance equation
is satisfied, i.e.,

$$p^*_f(x) \, q_{f,f'} = p^*_{f'}(x) \, q_{f',f}, \quad \forall f, f' \in \mathcal{F}. \tag{28}$$

Satisfying the above equations guarantees that the designed
Markov chain has the desired stationary distribution as in
(27).

Recall that for a node $v \in V$, the set of its neighbors under
configuration $f$ is denoted by $N_{v,f}$. We call a node in $N_{v,f}$ $v$'s
in-use neighbor and a node in $N_v \setminus N_{v,f}$ $v$'s not-in-use neighbor.
For ease of explanation, we further define $N_f$ as the set of
all the node pairs under $f$, i.e., $N_f = \{\{v, u\}, \forall v \in V, u \in N_{v,f}\}$.
Note that we do not differentiate node pairs $\{u, v\}$ and $\{v, u\}$. As an
example, for the peering configuration $f$ shown in Fig. 1(b),
$N_f$ is given by $\{\{s,1\}, \{s,2\}, \{s,4\}, \{1,2\}, \{1,4\}, \{2,3\}, \{3,4\}\}$.

In our Markov chain design, we first specify its state space
structure as follows: we set the transition rate $q_{f,f'}$ to zero
unless $f$ and $f'$ satisfy $|N_f \setminus N_{f'}| = 1$ or $|N_{f'} \setminus N_f| = 1$.
In other words, we only allow direct transitions between two
configurations if such transitions correspond to a single node
adding a new node to its in-use neighbor set or removing one
in-use neighbor from its in-use neighbor set.
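
The node-pair set and the "one link apart" relation between configurations can be expressed with plain sets. This is an illustrative sketch: the helper names are hypothetical, and reading the transition condition as "the two pair sets differ in exactly one element" is an interpretation of the single add-or-remove rule stated above.

```python
def node_pairs(in_use):
    """Build N_f from a map node -> set of in-use neighbors.

    Pairs are unordered, so {v, u} and {u, v} collapse into one element.
    """
    return {frozenset((v, u)) for v, nbrs in in_use.items() for u in nbrs}


def directly_connected(pairs_f, pairs_g):
    """True if the configurations differ by adding or removing one pair."""
    return len(pairs_f ^ pairs_g) == 1


# Example configuration on nodes s, 1, 2, 3, 4 (illustrative input).
in_use = {
    "s": {1, 2, 4},
    1: {"s", 2, 4},
    2: {"s", 1, 3},
    3: {2, 4},
    4: {"s", 1, 3},
}
pairs = node_pairs(in_use)
without_34 = pairs - {frozenset((3, 4))}
print(len(pairs), directly_connected(pairs, without_34))
```

Restricting transitions to such neighboring configurations is what lets a single node trigger a hop without coordinating with the rest of the system.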

Second, given the state space structure of the Markov chain, we
design the transition rates to favor distributed implementation
while satisfying the detailed balance equation in (28).

One possible option is to set $q_{f,f'}$ to $\exp^{-1}(\beta x_f)$. One
way to implement this option is for every node to generate
a timer according to its measured receiving rate and count
down accordingly. When the timer expires, the dedicated node
performs the neighbor swapping and resets its timer. As simple
as the implementation may sound, this option is expensive
to implement. Once the peering configuration changes, the
system needs to notify all the nodes to measure the new
receiving rate and reset their timers accordingly. It is not clear
how to implement such system-wide notification in a
low-overhead manner.

In this paper, we design $q_{f,f'}$ and $q_{f',f}$ as follows:

$$q_{f,f'} = \frac{1}{\exp(\tau)} \cdot \frac{\exp(\beta x_{f'})}{\exp(\beta x_f) + \exp(\beta x_{f'})} \tag{29}$$

and

$$q_{f',f} = \frac{1}{\exp(\tau)} \cdot \frac{\exp(\beta x_f)}{\exp(\beta x_{f'}) + \exp(\beta x_f)}, \tag{30}$$

where $\tau$ is a constant. It is straightforward to verify that the
detailed balance equation is satisfied. As will be clear in the
next subsection, our choices of transition rates do not require
coordination or notification among peers in their implementation.
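
The detailed balance claim can be verified numerically. The sketch below assumes the transition rate from a configuration with rate $x_f$ to one with rate $x_{f'}$ takes the form $\exp(-\tau) \exp(\beta x_{f'}) / (\exp(\beta x_f) + \exp(\beta x_{f'}))$; the function name and the sample values are hypothetical.

```python
import math


def transition_rate(x_from, x_to, beta, tau):
    """Assumed hop rate between two neighboring configurations."""
    return math.exp(-tau) * math.exp(beta * x_to) / (
        math.exp(beta * x_from) + math.exp(beta * x_to))


x_f, x_g, beta, tau = 1.2, 2.0, 3.0, 0.5  # illustrative values
# Unnormalized stationary weights exp(beta * x); the normalizer cancels.
lhs = math.exp(beta * x_f) * transition_rate(x_f, x_g, beta, tau)
rhs = math.exp(beta * x_g) * transition_rate(x_g, x_f, beta, tau)
print(math.isclose(lhs, rhs))
```

Both sides reduce to the same symmetric expression in $x_f$ and $x_{f'}$, so the check holds for any pair of rates and any positive $\beta$.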

C. Distributed Implementation 

One distributed implementation of our designed Markov 
chain is briefly described as follows. 

• Initialization: Each peer $v \in V$ randomly selects
neighbors from its neighbor list $N_v$ under the node degree
bound and builds connections with these selected
neighbors.

• Step 1: Let / denote the current configuration. Each 
node v e V generates an exponentially distributed random 
number independently with mean "^j T \ and counts 
down according to this number. 

• Step 2: When the count-down expires, node v measures 
its current receiving rate as an estimate of the broadcast 
rate x/. Then with probability node v goes to the 

Step 2a; with probability J^""^ , node v goes to the 
Step 2b; 

- Step 2a: Node $v$ randomly selects one in-use neighbor
in $N_{v,f}$ and removes it from $N_{v,f}$. Under the new
peering configuration $f'$, node $v$ measures its receiving
rate as an estimate of $x_{f'}$. With the estimates of
$x_f$ and $x_{f'}$, peer $v$ stays in the new configuration $f'$
with probability $\frac{\exp(\beta x_{f'})}{\exp(\beta x_f) + \exp(\beta x_{f'})}$, and switches back to
$f$ with probability $1 - \frac{\exp(\beta x_{f'})}{\exp(\beta x_f) + \exp(\beta x_{f'})}$. Node $v$ then
repeats Step 1.
- Step 2b: Node $v$ randomly selects one not-in-use
neighbor in $N_v \setminus N_{v,f}$. If the node degree of the
selected not-in-use node is equal to its bound or $v$'s
node degree is equal to its bound, node $v$ jumps
back to Step 1 immediately. Otherwise, node $v$ adds
this selected node to $N_{v,f}$. Under the new peering
configuration $f'$, node $v$ measures its receiving rate
as an estimate of $x_{f'}$. With the estimates of $x_f$ and
$x_{f'}$, peer $v$ stays in the new configuration $f'$ with
probability $\frac{\exp(\beta x_{f'})}{\exp(\beta x_f) + \exp(\beta x_{f'})}$, and switches back to $f$
with probability $1 - \frac{\exp(\beta x_{f'})}{\exp(\beta x_f) + \exp(\beta x_{f'})}$. Node $v$ then
repeats Step 1.

It is straightforward to summarize the above implementation
into a distributed algorithm that runs on individual nodes and
utilizes only the measurements from their one-hop neighbors.
The correctness of the implementation is shown as follows:

Proposition 1: The implementation in fact realizes a
time-reversible Markov chain with the stationary distribution in (27).
The proof is relegated to Appendix B.
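
The accept-or-revert decision that a node makes after trying a new configuration can be sketched as follows. This is an illustration only: the function names are hypothetical, the stay probability is taken to be $\exp(\beta x_{f'}) / (\exp(\beta x_f) + \exp(\beta x_{f'}))$, and the timer and branching details of Steps 1 and 2 are deliberately left out.

```python
import math
import random


def stay_probability(x_old, x_new, beta):
    """Probability of keeping the new configuration after a trial.

    Algebraically equal to exp(beta*x_new) / (exp(beta*x_old) + exp(beta*x_new)),
    rewritten so that only non-positive exponents are evaluated.
    """
    d = beta * (x_new - x_old)
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)


def try_hop(x_old, x_new, beta, rng):
    """Return True to keep the new configuration, False to switch back."""
    return rng.random() < stay_probability(x_old, x_new, beta)


rng = random.Random(0)
kept = sum(try_hop(1.0, 1.5, beta=4.0, rng=rng) for _ in range(10000))
print(kept / 10000)  # empirical frequency of keeping the better configuration
```

The decision uses only the two rate estimates measured locally by the node, in line with the one-hop information requirement.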

Remarks: a) In Step 1, the generation of count-down
timers does not depend on the receiving rate; thus the system
does not need to notify the nodes about changes of peering
configurations. b) With the above implementation, the system
hops towards configurations with better broadcast rates
probabilistically. For example, if $x_{f'} > x_f$, then the system will
be more likely to stay in configuration $f'$ than in $f$, and vice
versa. c) With large values of $\beta$, the system hops towards
better configurations more greedily. However, this may also
lead to the system getting trapped in locally optimal
configurations. Hence there is a trade-off to consider when
setting the value of $\beta$. Moreover, the value of $\beta$ also affects
the convergence rate of the time-reversible Markov chain to the
desired stationary distribution. Further investigation is needed
to better understand the impact of $\beta$. d) In the presence of
peer dynamics, our algorithm incurs only simple actions based
on local information. When a new peer arrives, a neighbor
set and a neighbor list are assigned to it. The peer builds
connections with the nodes in the neighbor set. Then the peer
starts counting down as in Step 1 and follows the strategy of our
algorithm. When a peer leaves, we simply remove it from the
neighbor lists of its previous neighbors and close the connections.

V. Convergence Properties of Overall Solution 

We have designed the distributed broadcasting algorithm in
Section III and the Markov chain guided topology hopping
algorithm in Section IV. The pseudocode of the two algorithms
is shown in Algorithm 1 and Algorithm 2, respectively. Both
algorithms are simple to implement, run on each individual
node, and only require nodes to exchange information with
their neighbors.

If the broadcasting algorithm converges instantaneously, i.e.,
the time-scale separation assumption holds, then we can obtain the
accurate value of $x_f$ for any configuration $f \in \mathcal{F}$. Transiting
based on the accurate $x_f$, the designed Markov chain will
converge to the desired stationary distribution in (27). Hence,
by operating these two algorithms in tandem, we obtain a
close-to-optimal broadcast rate under arbitrary node degree
bounds and over arbitrary overlay graphs. The optimality gap
is characterized in (22).

In practice, however, it is possible to obtain only an inaccurate measurement or estimate of $x_f$. These inaccuracies stem from two sources. One is the noisy measurement of the maximum broadcast rate given the configuration. The other is the fast state transition of the Markov chain, i.e., the Markov chain transits before the underlying broadcasting algorithm converges and thus it transits based on inaccurate observations of the broadcast rates.

Algorithm 1 Distributed Broadcasting Algorithm
The following procedure runs on each individual node independently.
For the source $s$ and each time slot,
    $z \leftarrow [z + \alpha(U'(z) - \sum_{d \in R} \lambda_{s,d})]^+$
For each node $v \in V$ and each time slot,
    $w^* \leftarrow 0$
    for $u \in out(v)$ do
        for $d \in R$ do
            $w_{vu} \leftarrow w_{vu} + \max(\lambda_{v,d} - \lambda_{u,d}, 0)$
        end for
        if $w_{vu} > w^*$ then
            $w^* \leftarrow w_{vu}$
            $u^* \leftarrow u$
        end if
    end for
    if $w_{vu^*} > 0$ then
        for $d \in R$ do
            if $\lambda_{v,d} - \lambda_{u^*,d} > 0$ then
                $f^d_{vu^*} \leftarrow C_v$
            end if
        end for
    end if
    for $d \in R$ do
        $\lambda_{v,d} \leftarrow [\lambda_{v,d} + \delta_{v,d}(\sum_{u \in in(v)} f^d_{uv} - \sum_{u \in out(v)} f^d_{vu})]^+$
    end for
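The neighbor-selection step of the broadcasting algorithm (Algorithm 1) can be illustrated with a minimal Python sketch. It assumes one reading of the pseudocode: the weight of link $(v,u)$ is the sum over receivers of the positive part of $\lambda_{v,d} - \lambda_{u,d}$, and node $v$ serves the neighbor with the largest weight at its full upload capacity. The function names, the dictionary layout of `lam`, and the numeric values are hypothetical and are not taken from the paper.

```python
def select_neighbor(v, out_neighbors, receivers, lam):
    """Pick the out-neighbor with the largest total positive queue differential."""
    best_u, best_w = None, 0.0
    for u in out_neighbors:
        # Back-pressure weight of link (v, u), summed over all receivers d.
        w = sum(max(lam[(v, d)] - lam[(u, d)], 0.0) for d in receivers)
        if w > best_w:
            best_u, best_w = u, w
    return best_u, best_w


def allocate_rates(v, u_star, receivers, lam, capacity):
    """Send at full upload capacity for every receiver with a positive differential."""
    if u_star is None:
        return {}
    return {d: capacity for d in receivers
            if lam[(v, d)] - lam[(u_star, d)] > 0}


# Hypothetical multiplier values for node "v" with out-neighbors "a" and "b".
lam = {("v", "d1"): 3.0, ("v", "d2"): 1.0,
       ("a", "d1"): 1.0, ("a", "d2"): 2.0,
       ("b", "d1"): 0.5, ("b", "d2"): 0.0}
u_star, weight = select_neighbor("v", ["a", "b"], ["d1", "d2"], lam)
rates = allocate_rates("v", u_star, ["d1", "d2"], lam, capacity=1.0)
```

In this toy instance neighbor "b" is selected, because its differentials are positive for both receivers, whereas neighbor "a" has a positive differential for only one of them.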

Consequently, the topology hopping Markov chain may not converge to the desired stationary distribution $p^*_f(x)$. This observation motivates the following study of the convergence of the Markov chain in the presence of inaccurate transition rates.

For each configuration $f \in \mathcal{F}$ with broadcast rate $x_f$, we assume its corresponding inaccurate observed rate belongs to the bounded region $[x_f - \Delta_f, x_f + \Delta_f]$. $\Delta_f$ is the inaccuracy bound and can be different for different $f$.

To simplify the explanation of our approach, we further assume the observed broadcast rate for configuration $f$ only takes one of the following $2n_f + 1$ discrete values:

$$x_f - \Delta_f, \; \ldots, \; x_f - \frac{1}{n_f}\Delta_f, \; x_f, \; x_f + \frac{1}{n_f}\Delta_f, \; \ldots, \; x_f + \Delta_f,$$

where $n_f$ is a positive integer. Further, with probability $\eta_{j,f}$, the observed broadcast rate takes value $x_f + \frac{j}{n_f}\Delta_f$, $\forall j \in \{-n_f, \ldots, n_f\}$, and $\sum_{j=-n_f}^{n_f} \eta_{j,f} = 1$.

With the inaccurate observed broadcast rates, the topology hopping behaves as follows. Suppose the current configuration is $f$ and the observed broadcast rate is $x_f + \frac{j}{n_f}\Delta_f$, where $j \in \{-n_f, \ldots, n_f\}$. After some count-down process, the system hops to a new configuration $f'$ and probes its broadcast rate. In configuration $f'$, the broadcast rate is observed as $x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}$, $j' \in \{-n_{f'}, \ldots, n_{f'}\}$. The system stays in the new configuration $f'$ with probability

$$\frac{\exp\left(\beta\left(x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}\right)\right)}{\exp\left(\beta\left(x_f + \frac{j}{n_f}\Delta_f\right)\right) + \exp\left(\beta\left(x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}\right)\right)}$$

and switches back to configuration $f$ with probability

$$1 - \frac{\exp\left(\beta\left(x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}\right)\right)}{\exp\left(\beta\left(x_f + \frac{j}{n_f}\Delta_f\right)\right) + \exp\left(\beta\left(x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}\right)\right)}.$$

By arguments similar to the proof of Proposition 1, the transition rate from configuration $f$ with broadcast rate $x_f + \frac{j}{n_f}\Delta_f$ to configuration $f'$ with broadcast rate $x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}$ is given by

$$\frac{1}{\exp(\tau)} \cdot \frac{\exp\left(\beta\left(x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}\right)\right)}{\exp\left(\beta\left(x_f + \frac{j}{n_f}\Delta_f\right)\right) + \exp\left(\beta\left(x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}\right)\right)}. \tag{31}$$

Algorithm 2 Topology Hopping Algorithm
The following procedure runs on each individual node independently. We focus on a particular node $v \in V$.
procedure Initialization
    • Initialize $N_v$, $B_v$; randomly connect to peers from $N_v$ under the degree bound.
    • Generate a timer that follows an exponential distribution with mean equal to $2\exp(\tau)/|N_v|$ and begin counting down.
end procedure
When the timer expires, invoke the procedure Transition.
procedure Transition
    With probability $|N_{v,f}|/|N_v|$,
        $N \leftarrow N_{v,f}$;
        randomly remove one in-use neighbor from $N_{v,f}$;
        $x_{f'} \leftarrow \sum_{u \in in(v)} f_{uv}$;
        $N_{v,f} \leftarrow N$ with probability $1 - \exp(\beta x_{f'})/(\exp(\beta x_f) + \exp(\beta x_{f'}))$;
        refresh the timer and begin counting down;
    With probability $1 - |N_{v,f}|/|N_v|$,
        $N \leftarrow N_{v,f}$;
        randomly add one not-in-use neighbor $u$ in $N_v \setminus N_{v,f}$ to $N_{v,f}$;
        if $|N_{v,f}| = B_v$ or $|N_{u,f}| = B_u$
            refresh the timer and begin counting down;
        end if
        $N_{v,f} \leftarrow N$ with probability $1 - \exp(\beta x_{f'})/(\exp(\beta x_f) + \exp(\beta x_{f'}))$;
        refresh the timer and begin counting down;
end procedure
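The stay-or-revert rule and the transition rate in (31) can be sketched in a few lines of Python. This is an illustration rather than the authors' implementation: the function names are hypothetical, and the probability is rewritten as a logistic function of the rate difference, which is algebraically identical to $\exp(\beta x_{f'})/(\exp(\beta x_f) + \exp(\beta x_{f'}))$ but avoids overflow when $\beta$ times the rate difference is large.

```python
import math


def observed_rate(x, delta, j, n):
    """Observed rate x + (j/n)*delta for an integer index j in {-n, ..., n}."""
    return x + (j / n) * delta


def stay_probability(beta, x_old, x_new):
    """exp(beta*x_new) / (exp(beta*x_old) + exp(beta*x_new)), computed stably."""
    t = beta * (x_new - x_old)
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def transition_rate(beta, tau, x_old, x_new):
    """Rate of hopping from the old to the new configuration, as in (31)."""
    return math.exp(-tau) * stay_probability(beta, x_old, x_new)


# Hypothetical numbers: true rates 0.5 and 1.0, each observed with some error.
beta, tau = 5.0, 1.0
y_old = observed_rate(0.5, delta=0.1, j=1, n=2)    # 0.55
y_new = observed_rate(1.0, delta=0.1, j=-2, n=2)   # 0.90
p_stay = stay_probability(beta, y_old, y_new)
rate = transition_rate(beta, tau, y_old, y_new)
```

A larger $\beta$ pushes `p_stay` towards 1 whenever the new configuration looks better, which is the greedy behavior the trade-off on $\beta$ refers to.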



We construct a Markov chain to capture and study the above topology hopping behavior. In this Markov chain, a state is associated with a configuration and an observed broadcast rate. Given any configuration $f \in \mathcal{F}$ and its corresponding $x_f$, there are $2n_f + 1$ states in the extended Markov chain: $(f, x_f + \frac{j}{n_f}\Delta_f)$, $j \in \{-n_f, \ldots, n_f\}$. Further, given direct transitions between configurations $f$ and $f'$ in the original topology hopping Markov chain, there are direct transitions between states $(f, x_f + \frac{j}{n_f}\Delta_f)$ and $(f', x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'})$ ($\forall j \in \{-n_f, \ldots, n_f\}$, $j' \in \{-n_{f'}, \ldots, n_{f'}\}$) in the corresponding new Markov chain. The corresponding transition rates are as follows:

$$q_{(f,\, x_f + \frac{j}{n_f}\Delta_f),\,(f',\, x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'})} = \frac{\eta_{j',f'}}{\exp(\tau)} \cdot \frac{\exp\left(\beta\left(x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}\right)\right)}{\exp\left(\beta\left(x_f + \frac{j}{n_f}\Delta_f\right)\right) + \exp\left(\beta\left(x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}\right)\right)} \tag{32}$$

and

$$q_{(f',\, x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}),\,(f,\, x_f + \frac{j}{n_f}\Delta_f)} = \frac{\eta_{j,f}}{\exp(\tau)} \cdot \frac{\exp\left(\beta\left(x_f + \frac{j}{n_f}\Delta_f\right)\right)}{\exp\left(\beta\left(x_f + \frac{j}{n_f}\Delta_f\right)\right) + \exp\left(\beta\left(x_{f'} + \frac{j'}{n_{f'}}\Delta_{f'}\right)\right)}, \tag{33}$$

where $\sum_{j=-n_f}^{n_f} \eta_{j,f} = 1$ and $\sum_{j'=-n_{f'}}^{n_{f'}} \eta_{j',f'} = 1$. This new Markov chain can be thought of as an extended version of the original topology hopping Markov chain. As an example, an extended Markov chain is shown and explained in Fig. 2.

[Figure 2 panels: "Original Topology Hopping Markov Chain M with Exact Broadcast Rates"; "Corresponding Extended Markov Chain M' with Inaccurately Observed Broadcast Rates"]

Fig. 2. An example of the original three-state topology hopping Markov chain and the extended Markov chain. M is the original topology hopping Markov chain with accurate broadcast rates. M' is the corresponding extended Markov chain with inaccurate broadcast rate observations. For each configuration $f \in \{1,2,3\}$, the observed broadcast rate takes values $x_f - \Delta_f$, $x_f$, $x_f + \Delta_f$ with probability $\eta_{-1,f}$, $\eta_{0,f}$ and $\eta_{1,f}$ respectively. The transition rates are assigned according to (32) and (33).

The extended Markov chain has a unique stationary distri- 
bution since it is irreducible and only has a finite number of 
states. We can study the impact of inaccurate broadcast rates 
by comparing the stationary configuration distribution of the 
new Markov chain and that of the original topology hopping 
Markov chain. 

We denote the stationary distribution of the states in the new Markov chain by

$$\tilde{p} = \left[\tilde{p}_{(f,\, x_f + \frac{j}{n_f}\Delta_f)},\; j \in \{-n_f, \ldots, n_f\},\; f \in \mathcal{F}\right]. \tag{34}$$

We also denote $p : [p_f(x), f \in \mathcal{F}]$ as the stationary distribution of the configurations in the extended Markov chain. Given a configuration $f \in \mathcal{F}$, there are $2n_f + 1$ states associated with $f$ in the extended Markov chain. We have

$$p_f(x) = \sum_{j \in \{-n_f, \ldots, n_f\}} \tilde{p}_{(f,\, x_f + \frac{j}{n_f}\Delta_f)}, \quad \forall f \in \mathcal{F}. \tag{35}$$
Recall that the stationary distribution of the configurations for the original topology hopping Markov chain is $p^* : [p^*_f(x), f \in \mathcal{F}]$. We use the total variation distance [36] to quantify the difference between $p^*$ and $p$, as

$$d_{TV}(p^*, p) = \frac{1}{2} \sum_{f \in \mathcal{F}} |p^*_f - p_f|. \tag{36}$$

We have the following result:

Theorem 3: Let $\Delta_{\max} = \max_{f \in \mathcal{F}} \Delta_f$, and $x_{\max} = \max_{f \in \mathcal{F}} x_f$. Then $d_{TV}(p^*, p)$ is bounded as follows:

$$0 \le d_{TV}(p^*, p) \le 1 - \exp(-2\beta\Delta_{\max}). \tag{37}$$

Further, the optimality gap in broadcast rates $|p^* x^T - p x^T|$ is bounded as below:

$$|p^* x^T - p x^T| \le 2 x_{\max} \left(1 - \exp(-2\beta\Delta_{\max})\right). \tag{38}$$

The proof is relegated to Appendix C.

Remarks: a) The upper bound on $d_{TV}(p^*, p)$ shown in (37) is general, as it is independent of the number of configurations $|\mathcal{F}|$, the values of $n_f$, and the distributions of inaccurate observed rates $\eta_{j,f}$ ($-n_f \le j \le n_f$, $f \in \mathcal{F}$). b) The upper bound on $d_{TV}(p^*, p)$ shown in (37) decreases exponentially as the worst-case inaccuracy bound $\Delta_{\max}$ decreases. c) It would be interesting to explore a tighter upper bound on $d_{TV}(p^*, p)$ than the one in (37).
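The bounds (37) and (38) can be checked numerically on a toy instance. The sketch below rests on two assumptions that are not restated in this section: the stationary distribution (27) has the Gibbs form $p^*_f \propto \exp(\beta x_f)$, and the extended chain with rates (32) and (33) satisfies detailed balance with state probabilities proportional to $\eta_{j,f}\exp(\beta(x_f + \frac{j}{n_f}\Delta_f))$, which can be verified by substituting into (32) and (33). All function names and numeric values are hypothetical.

```python
import math


def gibbs(beta, rates):
    """p*_f proportional to exp(beta * x_f), shifted by the max for stability."""
    m = max(rates)
    w = [math.exp(beta * (x - m)) for x in rates]
    z = sum(w)
    return [wi / z for wi in w]


def noisy_config_distribution(beta, rates, deltas, etas):
    """Configuration distribution of the extended chain, aggregated as in (35).

    etas[f] lists eta_{j,f} for j = -n_f, ..., n_f (odd length >= 3, sums to 1).
    """
    m = max(x + d for x, d in zip(rates, deltas))
    weights = []
    for x, delta, eta in zip(rates, deltas, etas):
        n = (len(eta) - 1) // 2
        weights.append(sum(e * math.exp(beta * (x + (j / n) * delta - m))
                           for j, e in zip(range(-n, n + 1), eta)))
    z = sum(weights)
    return [w / z for w in weights]


def total_variation(p, q):
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


beta = 5.0
rates = [1.0, 0.8, 0.5]          # hypothetical x_f
deltas = [0.05, 0.1, 0.02]       # hypothetical Delta_f
etas = [[0.2, 0.6, 0.2], [0.5, 0.3, 0.2], [0.1, 0.1, 0.8]]

p_star = gibbs(beta, rates)
p = noisy_config_distribution(beta, rates, deltas, etas)
tv = total_variation(p_star, p)
bound = 1.0 - math.exp(-2.0 * beta * max(deltas))
gap = abs(sum((a - b) * x for a, b, x in zip(p_star, p, rates)))
```

For these values the measured distance `tv` is far below `bound`, which is in line with remark c) that the bound in (37) may not be tight.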

VI. Performance Evaluation 

We implement a packet-level simulator for our proposed solution and use it to evaluate the solution's performance.

A. Settings 

In our simulations, time is divided into slots of equal length, and we adopt three different settings. In Setting I, we set the total number of nodes to be 100, and assign the node upload capacities randomly according to the distribution in Table II, which is obtained from the uplink bandwidth statistics of Internet hosts [37]. We set the source's upload capacity to be
768 kbps; with this upload capacity, source is not the broadcast 
bottleneck Q, 0. 

Setting II is the same as Setting I, except we set the total 
number of nodes to be 10. 

In Setting III, there are 4 different peering configurations as shown in Fig. 3. Every node has a unit capacity. Under three of the configurations the maximum broadcast rate is 1, and under the remaining configuration the maximum broadcast rate is 0.5.

When running our network coding based broadcasting algorithm, we set the updating step sizes of $z$ and $\lambda$ to be 0.1 and 0.00005 respectively. These parameters are empirically chosen to obtain smooth algorithm updating and small errors.

TABLE II
Peer upload capacity distribution

Upload Capacity (kbps)    64     128    256    384    768
Fraction (%)              2.8    14.3   4.3    23.3   55.3

In our simulations, we assign node degree bounds in the following two ways. The first is to set an identical degree bound for every node. The second is to set each node's degree bound proportional to its upload capacity. This is based on the empirical observation that nodes with high upload capacities usually have more system resources (e.g., memory and CPU power) than nodes with low upload capacities. With more system resources, nodes can maintain more concurrent connections, and thus have larger node degree bounds. In our second degree bound assignment, nodes set their node degree bounds proportional to the ratio between their upload capacities and 64 kbps. In particular, nodes with 64 kbps have a degree bound of 2, nodes with 128 kbps have a degree bound of 4, and so on.
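The proportional degree bound assignment can be sketched as follows. The capacity fractions are those of Table II; the helper names and the fixed random seed are hypothetical, and the bounds for capacities above 128 kbps are an extrapolation of the stated rule (2 per 64 kbps), since the text lists only the first two cases explicitly.

```python
import random

# Table II: upload capacity (kbps) -> fraction of peers (%).
CAPACITY_DISTRIBUTION = {64: 2.8, 128: 14.3, 256: 4.3, 384: 23.3, 768: 55.3}


def sample_capacities(num_peers, rng):
    """Draw peer upload capacities according to the Table II fractions."""
    capacities = list(CAPACITY_DISTRIBUTION)
    weights = list(CAPACITY_DISTRIBUTION.values())
    return rng.choices(capacities, weights=weights, k=num_peers)


def proportional_degree_bound(capacity_kbps, base_kbps=64, bound_at_base=2):
    """Degree bound proportional to capacity: 64 kbps -> 2, 128 kbps -> 4, ..."""
    return bound_at_base * capacity_kbps // base_kbps


rng = random.Random(0)                      # fixed seed, for repeatability only
capacities = sample_capacities(100, rng)    # Setting I uses 100 nodes
degree_bounds = [proportional_degree_bound(c) for c in capacities]
```

Under this rule a 768 kbps peer would be allowed 24 neighbors, so high-capacity peers can spread their upload bandwidth over many more connections than a 64 kbps peer.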

We carry out two sets of simulations. First, we evaluate the 
performance of our distributed broadcasting algorithm under 
Setting I and II. Second, we evaluate the overall performance 
when we combine the topology hopping algorithm and the 
broadcasting algorithm under Setting I and III. In these two 
sets of simulations, we also compare the performance under 
the two degree bounds assignments explained in the previous 
paragraph. 




Fig. 3. Peering configurations under Setting III. For ease of illustration, we only allow node 1 to add or remove neighbors between nodes 2 and 4. The remaining nodes keep their neighbors fixed.



B. Evaluation of the Proposed Broadcasting Algorithm 

In this simulation, we evaluate our distributed broadcasting algorithm proposed in Section III. We randomly choose a sub-graph that satisfies the node degree bound constraints, and run our algorithm over it. We evaluate three aspects of the proposed algorithm: 1) Does it converge to the optimal broadcast rate as expected from the theoretical analysis? 2) How fast does it converge? 3) How do different values of degree bounds affect the maximum broadcast rate? The results are summarized in Fig. 4.

Fig. 4. Broadcasting algorithm evaluations. a) The source broadcast rate and average peer receiving rate under Setting II when degree bound is set to 3; b) The source broadcast rate and average peer receiving rate under Setting I when degree bound is set to 4; c) The source broadcast rate and average peer receiving rate under Setting I when degree bound is set to 10; d) The impact of degree bound on the peer receiving rate under Setting I. The full-mesh rate is the maximum broadcast rate when the node degrees are unbounded [1].

Fig. 5. a) Optimal configuration distribution for different values of $\beta$ under Setting III; b) Configuration distribution obtained by our algorithm for different values of $\beta$ under Setting III.

Fig. 6. Evaluation of our overall solution, which combines the topology hopping algorithm and the broadcasting algorithm. a) The average peer receiving rate when the node degree bound is 3 and $\beta$ is 20; b) The average peer receiving rate when the node degree bound is 3 and $\beta$ is 50; c) The average peer receiving rate when peer degree bound is proportional to its upload capacity and $\beta$ is 20. The percentage improvement in average receiving rate of our overall algorithm over our broadcasting algorithm and over the simple heuristic algorithm is shown in these three figures. For example, in (a), 22% means that the average receiving rate of our overall algorithm is 1.22 times that of our broadcasting algorithm, and 550% means that the average receiving rate of our overall algorithm is 6.5 times that of the simple heuristic algorithm.

From Fig. 4(a) and Fig. 4(b), we see that our broadcasting algorithm converges. It converges faster in the small network, as shown in Fig. 4(a), than in the large network, as shown in Fig. 4(b). From Fig. 4(d), we also see the converged rate when the node degree bound is 10 is very close to a
theoretical upper bound - the optimal broadcast rate under 
no degree bounds computed according to H), ifTTl . Q. This 
suggests that our algorithm converges to the optimal broadcast 
rate. 

Under different degree bounds, the optimal broadcast rate varies. Fig. 4(d) shows that the optimal broadcast rate increases when we increase the node degree bounds. We plot the CDF of peer receiving rates (after the broadcasting algorithm converges) for the cases where the degree bound is 4, 10, and proportional to the peer's upload capacity. When the bound is 10, the obtained rate is close to the full-mesh rate, which suggests that we do not need a large degree bound to achieve close to the full-mesh rate. The obtained rate is also close to the full-mesh rate when the degree bound is proportional to the peer's upload capacity.

C. Evaluation of the Overall Solution 

Our overall solution, which combines the Markov chain guided topology hopping algorithm and the back-pressure and network coding based broadcasting algorithm, achieves a near-optimal broadcast rate under arbitrary node degree bounds and over arbitrary overlay graph. To evaluate its performance, we generate a sub-graph randomly, run our algorithms on every node, and evaluate the achieved broadcast rate.

The topology hopping algorithm runs on top of the broadcasting algorithm. Under a given topology, the broadcasting algorithm achieves the optimal broadcast rate. Nodes swap neighbors based on their observed receiving rate, thus changing the topology from time to time. In the simulation, we run the broadcasting algorithm long enough so that it converges before the topology transits according to the Markov chain. This way, the overall algorithm converges to the close-to-optimal broadcast rate.

In all simulations, we compare our overall algorithm with our back-pressure and network coding based broadcasting algorithm to illustrate the benefit of topology hopping, and with a simple heuristic algorithm introduced below to illustrate the benefit of our overall solution. Recall that no existing work solves the problem of streaming-rate maximization under general node degree bounds and over arbitrary topology that we study in this paper.

The simple heuristic algorithm that we compare our overall algorithm against is also composed of two parts: a routing-based broadcasting algorithm and a random topology hopping algorithm. In the routing-based broadcasting algorithm, each peer evenly allocates its upload capacity to its neighbors. Given the topology and capacity allocation, a centralized routing strategy (e.g., a spanning-tree based solution) is used to achieve the best broadcast rate the system can support. Similarly, the random topology hopping algorithm runs on top of the broadcasting algorithm. Every peer maintains a timer. When the timer of a peer expires, the peer randomly drops one active neighbor that is exchanging data with it, and then selects one random candidate from its feasible neighbor list and starts to exchange data with it. By doing so, we actually allow nodes running the simple scheme to have a node degree beyond the bounds. This relaxation gives the simple scheme more degrees of freedom to optimize its performance. Overall, the topology changes randomly, and on top of the current topology peers use routing to exchange streaming data.

Our first observation is that our overall scheme converges to the solution that theory predicts. We carry out simulations under Setting III. Under this setting the optimal broadcast rate is 1. The optimal configuration solution to problem MRC-$\beta$ is calculated and shown in Fig. 5(a) for different values of $\beta$. We run the overall scheme for this specific case and show the empirical configuration distribution in Fig. 5(b). Comparing the distributions in Fig. 5(a) and Fig. 5(b), we can see that the distribution obtained by our overall solution is very close to the optimal one. We also calculate the achieved broadcast rate under different values of $\beta$. For $\beta = 1$, 5, and 10, the broadcast rate is 0.917, 0.987, and 0.998 respectively. We see that with large $\beta$, the achieved broadcast rate is close to the optimal value 1, as predicted by our analysis in Section IV.
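As a rough cross-check of the Setting III numbers, the expected broadcast rate can be computed in closed form, assuming the optimal configuration distribution has the Gibbs form $p^*_f \propto \exp(\beta x_f)$ used in the Markov approximation framework. Both this form and the function names below are assumptions of the sketch; the rates are the four Setting III values (three configurations with rate 1 and one with rate 0.5).

```python
import math


def gibbs_distribution(beta, rates):
    """p*_f proportional to exp(beta * x_f), shifted by the max for stability."""
    m = max(rates)
    w = [math.exp(beta * (x - m)) for x in rates]
    z = sum(w)
    return [wi / z for wi in w]


def expected_rate(beta, rates):
    """Average broadcast rate under the Gibbs configuration distribution."""
    return sum(p * x for p, x in zip(gibbs_distribution(beta, rates), rates))


SETTING_III_RATES = [1.0, 1.0, 1.0, 0.5]
predicted = {beta: expected_rate(beta, SETTING_III_RATES) for beta in (1, 5, 10)}
```

Under this assumption the predicted rates are approximately 0.916, 0.987 and 0.999 for $\beta = 1$, 5 and 10, close to the simulated values reported above.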

Next, we evaluate our overall solution under Setting I. In Fig. 6(a) and Fig. 6(b), the broadcast rates obtained are 305 kbps and 312 kbps respectively. They are about 22% and 25% higher respectively than the broadcast rate of 250 kbps achieved by running the broadcasting algorithm over a randomly chosen topology. This demonstrates the advantage of performing topology hopping to optimize the configuration, as compared to only randomly choosing a topology.

By setting node degree bounds proportional to peers' upload capacity, nodes with higher upload capacity maintain more connections. From Fig. 6(c), we observe that this flexibility offers a broadcast rate of 475 kbps. Although the additional gain of topology hopping is small under the specific P2P simulation settings (e.g., node uplink capacity distribution), we remark that our topology-hopping based algorithm is theoretically guaranteed to achieve a close-to-optimal streaming rate under arbitrary node degree bounds and P2P settings, while the broadcasting algorithm with random topology selection has no performance guarantee. Moreover, in practical P2P streaming systems, the node degree bounds are typically small.
For example, in PPLive, the node degree bounds are 15- 
20 03], while the size of the system (i.e., total number of 
peers that are simultaneously watching the same channel) 
is usually hundreds of thousands. Thus, we suspect we can 
see substantial gain of topology hopping if our algorithm is 
implemented in such system with small node degree bounds, 
as suggested by our simulation results under small node degree 
bounds. 

From Fig. 6(a), Fig. 6(b) and Fig. 6(c), we observe that the average receiving rate of our overall algorithm is about 5.5-7 times higher than that of the simple algorithm. We can also see from Fig. 6(a) and Fig. 6(b) that our algorithm achieves a smoother streaming rate than the simple algorithm, because our algorithm optimizes the topology hopping and stays in the optimal topology while the simple algorithm hops among topologies randomly and arbitrarily.

VII. Discussions and Future Work 

In this paper, we propose a distributed solution that achieves a near-optimal broadcast rate under arbitrary node degree bounds, and over arbitrary overlay graph. Our solution consists of two algorithms that can be of independent interest. The first is a distributed broadcasting algorithm that optimizes the broadcast rate given a P2P topology. It is derived from a network coding based problem formulation and utilizes back-pressure arguments. It can be considered as an extension of the algorithm in [30] from link-capacity-limited underlay networks to node-capacity-limited overlay networks. The second algorithm is a Markov chain guided hopping algorithm that optimizes the topology, inspired by the Markov approximation framework introduced in [20].

Assuming the underlying broadcasting algorithm converges instantaneously, the topology hopping algorithm converges to the optimal configuration distribution. When the broadcasting algorithm does not converge fast enough, the topology hopping Markov chain transits based on inaccurate observations of the maximum broadcast rates associated with the configurations. We show that the topology hopping algorithm still converges, but to a sub-optimal configuration distribution. We characterize an upper bound on the total variation distance between the optimal and sub-optimal configuration distributions, as well as an upper bound on the gap between the achieved and the optimal broadcast rates. We show that both bounds decrease exponentially as the bound on inaccuracy decreases.

Using uplink bandwidth statistics of Internet hosts, our 
simulations validate the effectiveness of the proposed solu- 
tions, and demonstrate the advantage of allowing node degree 
bounds to scale linearly with their upload capacities. 



In scenarios where network coding is not allowed, we can formulate the broadcasting problem in Subsection III-B as a linear program that constructs a feasible node capacity allocation so that the sum of the rates of all spanning trees is maximized [15], which is solvable by centralized LP algorithms. Then we can design the overall algorithm in the following way. The overall algorithm is also composed of two separate algorithms: the spanning tree based broadcasting algorithm and the Markov chain guided hopping algorithm. The topology hopping algorithm is the same as the one in Section IV, which runs on top of the broadcasting algorithm and guides the topology hopping. Compared to our distributed overall algorithm when network coding is applied, this algorithm is centralized, making it unsuited for use in dynamically changing systems.

Two interesting future directions are as follows. First, the convergence rate of our solution is determined by the mixing time of the topology-hopping Markov chain, which can be substantial for large P2P systems. It is thus of great interest to explore the design of topology-hopping Markov chains that mix fast and at the same time allow distributed implementation. Second, while our algorithms adapt well to peer dynamics, our theoretical analysis is for static scenarios. How to extend the analysis to dynamic scenarios such as those observed in practical P2P systems [38] is another interesting future direction.

Acknowledgement 

This work was partially supported by the General Research 
Fund grants (Project No. 411008, 411209, 411010) and an 
Area of Excellence Grant (Project No. AoE/E-02/08), all 
established under the University Grant Committee of the Hong 
Kong SAR, China. This work was also partially supported by 
two gift grants from Microsoft and Cisco. 

References 

[1] J. Li, P. A. Chou, and C. Zhang, "Mutualcast: an efficient mechanism for content distribution in a p2p network," in Proc. ACM SIGCOMM Asia Workshop, 2005.
[2] L. Massoulie, A. Twigg, G. Gkantsidis, and P. Rodriguez, "Randomized Decentralized Broadcasting Algorithms," in Proc. IEEE INFOCOM, 2007.
[3] R. Kumar, Y. Liu, and K. Ross, "Stochastic fluid theory for p2p streaming systems," in Proc. IEEE INFOCOM, 2007.
[4] Y. Cui, Y. Xue, and K. Nahrstedt, "Optimal resource allocation in overlay multicast," IEEE Trans. Parallel and Distributed Systems, vol. 17, pp. 808-823, 2006.
[5] S. Sengupta, S. Liu, M. Chen, M. Chiang, J. Li, and P. A. Chou, "Peer-to-peer streaming capacity," IEEE Trans. Information Theory, vol. 57, pp. 5072-5087, 2011.
[6] M. Castro, P. Druschel, A. Kermarrec, A. Nandi, A. Rowstron, and A. Singh, "SplitStream: high-bandwidth multicast in cooperative environments," in Proc. ACM SOSP, 2003.
[7] V. Padmanabhan and K. Sripanidkulchai, "The case for cooperative networking," in Proc. IPTPS, 2002.
[8] D. A. Tran, K. Hua, and T. Do, "ZIGZAG: An efficient peer-to-peer scheme for media streaming," in Proc. IEEE INFOCOM, 2003.
[9] N. Magharei and R. Rejaie, "PRIME: Peer-to-peer receiver-driven mesh-based streaming," IEEE/ACM Trans. Networking, vol. 17, pp. 1052-1065, 2009.
[10] S. Liu, M. Chen, S. Sengupta, M. Chiang, J. Li, and P. A. Chou, "Peer-to-peer streaming capacity under node degree bound," in Proc. IEEE ICDCS, 2010.
[11] C. Feng, B. Li, and B. Li, "Understanding the performance gap between pull-based mesh streaming protocols and fundamental limits," in Proc. IEEE INFOCOM, 2009.
[12] L. Abeni, C. Kiraly, and R. L. Cigno, "On the optimal scheduling of streaming applications in unstructured meshes," in Proc. IFIP Networking, 2009.
[13] M. Wang and B. Li, "R2: Random Push with Random Network Coding in Live Peer-to-Peer Streaming," IEEE Journal on Selected Areas in Communications, vol. 25, p. 1655, 2007.
[14] X. Zhang, J. Liu, B. Li, and T. Yum, "CoolStreaming/DONet: A data-driven overlay network for efficient live media streaming," in Proc. IEEE INFOCOM, 2005.
[15] M. Chen, M. Chiang, P. A. Chou, J. Li, S. Liu, and S. Sengupta, "Peer-to-peer streaming capacity: Survey and recent results," in Proc. Allerton Conference, 2009.
[16] D. M. Chiu, R. W. Yeung, J. Huang, and B. Fan, "Can network coding help in p2p networks?" in Proc. IEEE NetCod, 2006.
[17] M. Chen, M. Ponec, S. Sengupta, J. Li, and P. Chou, "Utility maximization in peer-to-peer systems," in Proc. ACM SIGMETRICS, 2008.
[18] PPLive. [Online]. Available: http://www.pplive.com
[19] UUSee. [Online]. Available: http://www.uusee.com
[20] M. Chen, S. C. Liew, Z. Shao, and C. Kai, "Markov approximation for combinatorial network optimization," in Proc. IEEE INFOCOM, 2010.
[21] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, "Network information flow," IEEE Trans. Information Theory, pp. 1204-1216, 2000.
[22] L. Tassiulas and A. Ephremides, "Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks," IEEE Trans. Automatic Control, vol. 37, pp. 1936-1948, 1992.
[23] N. McKeown, V. Anantharam, and J. Walrand, "Achieving 100% throughput in an input-queued switch," in Proc. IEEE INFOCOM, 1996.
[24] A. Eryilmaz, R. Srikant, and J. Perkins, "Stable scheduling policies for fading wireless channels," IEEE/ACM Trans. on Networking, vol. 13, pp. 411-424, 2005.
[25] M. Neely, E. Modiano, and C. Rohrs, "Dynamic power allocation and routing for time-varying wireless networks," IEEE Journal on Selected Areas in Communications, vol. 23, pp. 89-103, 2005.
[26] J. Liu, S. Rao, B. Li, and H. Zhang, "Opportunities and challenges of peer-to-peer internet video broadcast," in Proc. IEEE, 2008.
[27] Z. Liu, C. Wu, B. Li, and S. Zhao, "Uusee: Large-scale operational on-demand streaming with random network coding," in Proc. IEEE INFOCOM, 2010.
[28] X. Hei, C. Liang, J. Liang, Y. Liu, and K. Ross, "A Measurement Study of a Large-Scale P2P IPTV System," IEEE Trans. Multimedia, vol. 9, pp. 1672-1687, 2007.
[29] P. Chou, Y. Wu, and K. Jain, "Practical network coding," in Proc. Allerton Conference, 2003.
[30] T. Ho and H. Viswanathan, "Dynamic algorithms for multicast with intra-session network coding," IEEE Trans. Information Theory, vol. 55, pp. 797-815, 2005.
[31] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge University Press, 2004.
[32] T. Ho, M. Medard, R. Koetter, D. Karger, M. Effros, J. Shi, and B. Leong, "A random linear network coding approach to multicast," IEEE Trans. Information Theory, vol. 52, 2004.
[33] P. Sanders, S. Egner, and L. Tolhuizen, "Polynomial time algorithms for network information flow," in Proc. ACM Symposium on Parallel Algorithms and Architectures, 2003.
[34] R. Srikant, The Mathematics of Internet Congestion Control. Birkhauser, 2004.
[35] L. Georgiadis, M. Neely, and L. Tassiulas, Resource allocation and cross layer control in wireless networks. Now Pub, 2006.
[36] P. Diaconis and D. Stroock, "Geometric bounds for eigenvalues of Markov chains," The Annals of Applied Probability, pp. 36-61, 1991.
[37] C. Huang, J. Li, and K. W. Ross, "Can internet video-on-demand be profitable?" in Proc. ACM SIGCOMM, 2007.
[38] F. Wang, J. Liu, and Y. Xiong, "Stable peers: Existence, importance, and application in peer-to-peer live video streaming," in Proc. IEEE INFOCOM, 2008.
[39] F. Kelly, Reversibility and stochastic networks. Wiley, Chichester, 1979.



Appendix 

A. Proof of Theorem 1

We use the following Lyapunov function 

veV deR 1 

where z* , X" are the saddle points of ©• 

By differentiating the Lyapunov function with respect to 
time we get 



We use the above two equations (43) and (44) to substitute the corresponding terms in the inequality (39) and then get

Next we check the value of (45), (46), (47) respectively. First, the strict concavity of $U(\cdot)$ implies

KKT conditions for $z^*$, $\lambda^*$ are shown as follows

and

Since $z^*$, $f^*$ are optimal solutions, they should satisfy the constraints of the problem MP. So we have

where $f^*$ is the optimal solution of MP.

From equation (40), we obtain

By using the above equation (42), we can transform the terms in the inequality (39) as follows

Note that $f$ is the solution of the following problem

which is equivalent to SSP. Since $f^*$ is also feasible, for (46) we have

Now we focus on the term (47). According to (41), the following equality holds.

Note that $f^*$ is the solution of the following problem

Overall, we get

Let $\mathcal{E} = \{(z, \lambda) \mid \dot{V}(z, \lambda) = 0\}$ and $\Omega = \{(z, \lambda) \mid (45) = 0, (46) = 0, (47) = 0\}$. Since $\dot{V}(z, \lambda) \le (45) + (46) + (47)$ and $(45) \le 0$, $(46) \le 0$, $(47) \le 0$, we have $\mathcal{E} \subseteq \Omega$. Let $\mathcal{M}$ be the largest invariant set in $\mathcal{E}$. By LaSalle's invariance principle, $(z(t), \lambda(t))$ converges to the set $\mathcal{M}$ as $t \to \infty$. Since $\mathcal{M} \subseteq \mathcal{E} \subseteq \Omega$, as $t \to \infty$ $(z(t), \lambda(t))$ satisfies



$$z(t) = z^* \tag{49}$$

and

(50)

Further, in $\mathcal{M}$, $\sum_{d \in R} \lambda_{s,d}(t) = U'(z^*)$. To see this, if this is not satisfied, then by (20) we can see that $z(t)$ will not stay at $z^*$, which contradicts (49). This concludes the proof.

B. Proof of Proposition 1

By the two conditions on the state space structure of the Markov chain, we know that all configurations can reach each other within a finite number of transitions, thus the constructed Markov chain is irreducible. Further, it is a finite-state ergodic Markov chain with a unique stationary distribution. We now show that the stationary distribution of the constructed Markov chain is indeed (27).

We first verify that under the implementation, the state transition rate from $f$ to $f'$ satisfies (29).

In our Markov chain design, we only allow direct transitions between two configurations if such transitions correspond to a single node adding a new neighbor or removing a neighbor, i.e., $|N_f \setminus N_{f'}| = 1$ or $|N_{f'} \setminus N_f| = 1$. We consider these two scenarios separately in the following.

Let $f \to f'$ denote the event that when the timer expires the process will enter state $f'$ after leaving the current state $f$. The probability of this event is denoted by $\Pr(f \to f')$.

When $|N_f \setminus N_{f'}| = 1$, assuming $N_f \setminus N_{f'} = (v,u)$, the event $f \to f'$ can be divided into two disjoint events: the event that node $v$'s timer expires and node $v$ then selects node $u$ and removes it from its in-use neighbor set, and the event that node $u$'s timer expires and node $u$ then selects node $v$ and removes it from its in-use neighbor set. Denote these two events by $f \xrightarrow{v-u} f'$ and $f \xrightarrow{u-v} f'$. Let $v - u$ be the event that node $v$ selects node $u$ and removes it from its in-use neighbor set and $u - v$ be the event that node $u$ selects node $v$ and removes it from its in-use neighbor set. Now we calculate the probability of $f \xrightarrow{v-u} f'$ and $f \xrightarrow{u-v} f'$ respectively.
and



state $f$ is the sum over all nodes $v \in V$ of their count-down
rates, each proportional to $\exp^{-1}(\tau)$. With probability $\Pr(f \to f')$,
the process jumps to state $f'$ when leaving state $f$. So, the
transition rate from state $f$ to $f'$ is
Therefore, we have
With (27), we see that $p^*_f(x)\, q_{f,f'} = p^*_{f'}(x)\, q_{f',f}$, $\forall f, f' \in \mathcal{F}$,
i.e., the detailed balance equations hold. Thus the constructed
Markov chain is time-reversible and its stationary distribution
is indeed (27), according to Theorem 1.3 and Theorem 1.14 in



When $|N_{f'} \setminus N_f| = 1$, assuming $N_{f'} \setminus N_f = \{(v,u)\}$, similarly
we divide $f \to f'$ into two disjoint events $f \xrightarrow{v+u} f'$ and
$f \xrightarrow{u+v} f'$. The event $f \xrightarrow{v+u} f'$ is that node $v$'s timer
expires, then node $v$ selects node $u$ and adds it to its
in-use neighbor set. The event $f \xrightarrow{u+v} f'$ is that node $u$'s
timer expires, then node $u$ selects node $v$ and adds it
to its in-use neighbor set. Let $v + u$ be the event that node $v$
selects node $u$ and adds it as an in-use neighbor, and $u + v$ be
the event that node $u$ selects node $v$ and adds it as an in-use
neighbor. Then we have

$\Pr(f \xrightarrow{v+u} f') = \Pr(v + u \mid v\text{'s timer expires}) \Pr(v\text{'s timer expires})$

C. Proof of Theorem

We denote by $M$ the original topology hopping Markov
chain with exact broadcast rates, and by $M'$ the corresponding
extended Markov chain with inaccurately observed broadcast
rates. For convenience of expression, for all $f \in \mathcal{F}$,
$j \in \{-n_f, \ldots, n_f\}$, we use $f_j$ to represent the state
$(f, x_f + \frac{j}{n_f}\Delta_f)$ in the extended Markov chain $M'$, and
$\eta_{f_j}$ to represent the distribution of inaccurately observed
rates $\eta_{j,f}$.

Therefore, given direct transitions between configurations $f$
and $f'$ in the original topology hopping Markov chain $M$,
there are direct transitions between states $f_j$ and $f'_k$
($\forall j \in \{-n_f, \ldots, n_f\}$, $k \in \{-n_{f'}, \ldots, n_{f'}\}$)
in the extended Markov chain $M'$. Following (32) and (33), we have
the corresponding transition rates
Therefore, we have
where $\sum_{j} \eta_{f_j} = 1$ and $\sum_{k} \eta_{f'_k} = 1$.

Now we compute the stationary distribution of states for the
extended Markov chain $M'$. By the detailed balance equations, we
have
Then we have
In our implementation, under configuration $f$, peer $v$ counts
down with a rate proportional to $\exp^{-1}(\tau)$. Therefore, the rate of leaving the
Consider an arbitrary state $\tilde{f}_0$ in the extended Markov chain
$M'$, where $\tilde{f} \in \mathcal{F}$ and $\tilde{f} \ne f, f'$. Since the state space of $M'$ is
connected, we can always find a path connecting $f_0$ and $\tilde{f}_0$
through a series of adjacent states $f(1)_0, \ldots, f(L)_0$, with $f_0 =
f(1)_0$ and $f(L)_0 = \tilde{f}_0$. Therefore,
By (63) and (66), we know that $\forall f \in \mathcal{F}$,
By (67), (68), and (69), we obtain the stationary distribution
of states for the extended Markov chain $M'$ as follows:
The stationary distribution of peer configurations in the
extended Markov chain $M'$ is the probability distribution of
the aggregate states $f_j$, $j \in \{-n_f, \ldots, n_f\}$, i.e.,
Then by (75), we have $a < \exp(\beta \Delta_{\max})$. Therefore,
This concludes the proof.
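
The exponential bound used at the end of this proof can be illustrated with a small numerical check. The sketch assumes that the observed rate under configuration $f$ takes the values $x_f + \frac{j}{n_f}\Delta_f$, $j \in \{-n_f, \ldots, n_f\}$, with probabilities $\eta_{j,f}$, and that the quantity of interest is the weighted factor $\sum_j \eta_{j,f} \exp(\beta \frac{j}{n_f} \Delta_f)$. The function name `sigma`, the error distribution, and the parameter values are hypothetical and may not match the symbols of the paper.

```python
import math

def sigma(eta, beta, delta):
    """Weighted factor sum_j eta_j * exp(beta * (j / n) * delta), j = -n..n."""
    n = (len(eta) - 1) // 2
    if len(eta) != 2 * n + 1 or not math.isclose(sum(eta), 1.0):
        raise ValueError("eta must be a distribution over 2n + 1 error levels")
    return sum(e * math.exp(beta * (j / n) * delta)
               for j, e in zip(range(-n, n + 1), eta))

beta, delta_max = 2.0, 0.3
eta = [0.1, 0.2, 0.4, 0.2, 0.1]   # hypothetical error distribution, n = 2
s = sigma(eta, beta, delta=0.25)  # per-configuration error bound <= delta_max

# A convex combination of exponentials with exponents in
# [-beta * delta, beta * delta] stays within the extreme values.
within_bounds = math.exp(-beta * delta_max) <= s <= math.exp(beta * delta_max)
print(within_bounds)
```

Because the factor is an average of terms that each lie between $\exp(-\beta\Delta_{\max})$ and $\exp(\beta\Delta_{\max})$, the bound holds for any error distribution, not only for the symmetric one chosen here.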



